The Problem with Manual Firewall Management

If you've been managing NSX Distributed Firewall policies through the GUI, you know the pain. Tracking who changed what, keeping policies consistent across environments, and managing multiple tenants quickly becomes a mess. I've been there - clicking through the NSX Manager interface, copy-pasting rules between environments, and hoping nothing breaks in production.

There had to be a better way, so I built a framework to handle this properly.

Enter NSX-DFW-Framework

I recently published NSX-DFW-Framework, a Terraform-based approach to managing NSX firewall policies as code. The core idea is simple: your security policies should live in YAML files that are version-controlled, reviewed, and deployed like any other infrastructure code.

No more mystery about when a rule was added or who approved it. Your Git history becomes your audit trail.

How It Actually Works

The framework uses a tenant-centric model. Each tenant gets two YAML files:

inventory.yaml - This is where you define your infrastructure:

  • VMs and their roles (web servers, databases, etc.)
  • Consumer and provider groups (which apps need to talk to what)
  • IP groups for external systems
  • Custom services if you need protocols beyond the standard ports

authorized-flows.yaml - This defines your actual security policies:

  • Which consumers can access which providers
  • What protocols and ports are allowed
  • Emergency access rules for break-glass scenarios

The framework handles the hard parts - maintaining rule order in NSX, creating hierarchical tags, and processing policies in the right sequence.

Policy Ordering That Makes Sense

Here's something that trips people up with NSX - rule order matters. A lot. The framework handles this with three policy levels:

  1. Emergency policies (Sequence 1) - Your break-glass rules go here
  2. Environment policies (Sequence 2) - Cross-environment traffic controls
  3. Application policies (Sequence 3+) - Processed in exactly the order you define them in YAML

When you reorder rules in your YAML file, the framework updates NSX accordingly without destroying and recreating policies. This was surprisingly tricky to get right with the Terraform provider.
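To make the ordering model concrete, here is a minimal sketch in Python of how the three levels could map to sequence numbers. This is not the framework's code (the framework does this in Terraform); the function name and the policy names are hypothetical, and only the numbering scheme (1, 2, 3+) comes from the list above.

```python
from typing import Dict, List

EMERGENCY_SEQ = 1          # break-glass rules
ENVIRONMENT_SEQ = 2        # cross-environment controls
FIRST_APPLICATION_SEQ = 3  # application policies start here


def assign_sequence_numbers(application_policies: List[str]) -> Dict[str, int]:
    """Map each policy name to a sequence number, keyed by name.

    Application policies are numbered in the order they are listed,
    mirroring "the order you define them in YAML".
    """
    sequence = {"emergency": EMERGENCY_SEQ, "environment": ENVIRONMENT_SEQ}
    for offset, name in enumerate(application_policies):
        if name in sequence:
            raise ValueError(f"duplicate policy name: {name}")
        sequence[name] = FIRST_APPLICATION_SEQ + offset
    return sequence


# Hypothetical policy names, before and after swapping the first two.
before = assign_sequence_numbers(["web-to-app", "app-to-db", "monitoring"])
after = assign_sequence_numbers(["app-to-db", "web-to-app", "monitoring"])

# The keys are identical in both runs; only the numbers differ.
changed = {name for name in before if before[name] != after[name]}
print(sorted(changed))  # ['app-to-db', 'web-to-app']
```

Because each policy is identified by its name rather than its list position, a reorder shows up as a changed attribute on an existing object instead of a delete and a create. In Terraform, a common way to get the same effect is for_each keyed by a stable name rather than count; I'm describing the general technique here, not claiming that is exactly how the framework implements it.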

Real-World Example

The repository includes a reference tenant (wld01) that shows a typical setup:

  • Three-tier web application (web, app, database tiers)
  • Active Directory infrastructure
  • Monitoring systems that need broad access
  • Development environment with different security requirements

You can copy this structure and adapt it to your environment. The naming conventions are documented, but the basic pattern is straightforward - organize by tenant, environment, application, and component.

Getting Started

If you want to try this out:

git clone https://github.com/vkernel/NSX-DFW-Framework.git
cd NSX-DFW-Framework
cp terraform.tfvars.example terraform.tfvars

Edit terraform.tfvars with your NSX Manager details, then:

terraform init
terraform plan  # Always review what will change
terraform apply

Start with the reference tenant to understand how everything fits together, then create your own tenant files.

Things to Know

A few gotchas I ran into while building this:

VM Names Must Match: The VM names in your YAML files need to match exactly what's in the NSX inventory. If NSX sees "web-server-01" and your YAML says "web-server-1", the VM won't be matched.
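A quick pre-flight check can catch this before Terraform runs. The sketch below is an illustration only, not part of the framework: the function name is hypothetical, and both lists are hard-coded sample data that you would replace with names read from your own YAML files and from an export of the NSX VM inventory.

```python
from typing import Iterable, List


def find_unmatched_vm_names(yaml_names: Iterable[str],
                            nsx_names: Iterable[str]) -> List[str]:
    """Return YAML VM names that have no exact string match in NSX."""
    known = set(nsx_names)
    return sorted(name for name in set(yaml_names) if name not in known)


# Hypothetical sample data (assumption: not taken from the repository).
yaml_vm_names = ["web-server-1", "app-server-01", "db-server-01"]
nsx_vm_names = ["web-server-01", "app-server-01", "db-server-01"]

missing = find_unmatched_vm_names(yaml_vm_names, nsx_vm_names)
for name in missing:
    print(f"no exact match in NSX inventory: {name}")
```

With the sample data, only "web-server-1" is reported, which is the mismatch described above.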

Emergency Policies and NSX Projects: Emergency policies don't function properly within NSX Projects due to platform limitations. If you're using Projects for additional isolation, you'll need to handle emergency access differently.

ALG Services: The Terraform provider has issues with ALG services, so they get rendered as regular TCP services. Not ideal, but it works for most cases.

Why This Approach

You might wonder why I didn't use the NSX Policy API directly, or some other tool. I tried those approaches. Terraform gave the best balance of:

  • State management (knowing what's currently deployed)
  • Change planning (seeing what will change before applying)
  • Provider ecosystem (easy to integrate with other infrastructure)
  • Team familiarity (most infrastructure teams already know Terraform)

The framework is opinionated about structure because that's what makes it maintainable. You can customize the YAML schema if needed, but the default patterns work well for multi-tenant environments.

What's Next

This is the first public release. I'm using it in production, but there's room for improvement:

  • Better validation of YAML before Terraform runs
  • Support for more complex context profiles
  • Enhanced CI/CD examples
  • Performance optimizations for large rule sets

If you're managing NSX security at scale, give it a try. The repository includes detailed documentation, troubleshooting guides, and the full reference implementation.

And if you find bugs or have suggestions, open an issue on GitHub.